A New Optimal Stepsize for Approximate Dynamic Programming
Similar resources
A New Optimal Stepsize Rule for Approximate Value Iteration
Approximate dynamic programming (ADP) has proven itself in a wide range of applications spanning large scale transportation problems, scheduling, health care and energy systems. Approximate value iteration is a particularly attractive ADP strategy because it reduces problems with long horizons into sequences of relatively small linear or integer programs that are relatively easy to solve. At th...
Stepsize Selection for Approximate Value Iteration and a New Optimal Stepsize Rule
Approximate value iteration is used in dynamic programming when we use random observations to estimate the value of being in a state. These observations are smoothed to approximate the expected value function, leading to the problem of choosing a stepsize (the weight given to the most recent observation). A stepsize of 1/n is a common (and provably convergent) choice. However, we prove that it ...
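The smoothing step described in this abstract can be illustrated with a minimal sketch. It is not the paper's optimal stepsize rule; it only shows how a stepsize weights the most recent observation, and that the 1/n choice reproduces the running sample average. The function names (`smooth`, `harmonic`, `constant`) and the sample observations are hypothetical and chosen for illustration.

```python
from typing import Callable, Iterable


def smooth(observations: Iterable[float], stepsize: Callable[[int], float]) -> float:
    """Smooth observations with v_n = (1 - a_n) * v_{n-1} + a_n * x_n.

    `stepsize(n)` returns a_n, the weight given to the n-th (most recent)
    observation. The initial estimate is 0.0 (an assumption; it is
    overwritten whenever a_1 = 1).
    """
    estimate = 0.0
    for n, x in enumerate(observations, start=1):
        a = stepsize(n)
        estimate = (1.0 - a) * estimate + a * x
    return estimate


def harmonic(n: int) -> float:
    """The 1/n stepsize: every observation ends up with equal weight."""
    return 1.0 / n


def constant(alpha: float) -> Callable[[int], float]:
    """A fixed stepsize: recent observations get geometrically more weight."""
    return lambda n: alpha


# Illustrative data only, not taken from the paper.
observations = [10.0, 12.0, 8.0, 14.0]

average_like = smooth(observations, harmonic)        # matches the sample mean
recent_heavy = smooth(observations, constant(0.5))   # leans toward later values
```

With the 1/n stepsize the estimate equals the plain average of all observations seen so far, so early observations are never discounted; a constant stepsize instead keeps responding to new observations.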
Optimal Learning and Approximate Dynamic Programming
Approximate dynamic programming (ADP) has emerged as a powerful tool for tackling a diverse collection of stochastic optimization problems. Reflecting the wide diversity of problems, ADP (including research under names such as reinforcement learning, adaptive dynamic programming and neuro-dynamic programming) has become an umbrella for a wide range of algorithmic strategies. Most of these invol...
A New Hybrid Critic-training Method for Approximate Dynamic Programming
A variety of methods for developing quasi-optimal intelligent control systems using reinforcement learning techniques based on adaptive critics have appeared in recent years. This paper reviews the family of approximate dynamic programming techniques based on adaptive critic methods and introduces a new hybrid critic training method.
Sequential Bayesian optimal experimental design via approximate dynamic programming
The design of multiple experiments is commonly undertaken via suboptimal strategies, such as batch (open-loop) design that omits feedback or greedy (myopic) design that does not account for future effects. This paper introduces new strategies for the optimal design of sequential experiments. First, we rigorously formulate the general sequential optimal experimental design (sOED) problem as a dy...
Journal
Journal title: IEEE Transactions on Automatic Control
Year: 2015
ISSN: 0018-9286, 1558-2523
DOI: 10.1109/tac.2014.2357134